ABSTRACT

Dysregulated bone morphogenetic protein (BMP)
signaling in endothelial cells (ECs) and pulmonary
arterial smooth muscle cells (PASMCs) is implicated
in human genetic disorders. Here, we generated
genome-wide maps of Smad1/5 binding sites
in ECs and PASMCs. Smad1/5 preferentially bound
to regions outside the promoters of known genes,
and the binding was associated with target gene
upregulation. Cell-selective Smad1/5 binding patterns
appear to be determined mostly by cell-specific
differences in baseline chromatin accessibility patterns.
We identified, for the first time, a Smad1/5 binding
motif in mammals, and termed it the GC-rich Smad binding
element (GC-SBE). Several sequences in the
identified GC-SBE motif had relatively weak affinity
for Smad binding, and were enriched in cell type-specific
Smad1/5 binding regions. We also found that
both GC-SBE and the canonical SBE affect binding
affinity for the Smad complex. Furthermore, we
characterized EC-specific Smad1/5 target genes and
found that several Notch signaling pathway-related
genes were induced by BMP in ECs. Among them, a
Notch ligand, JAG1, was regulated directly by Smad1/5,
transactivating Notch signaling in the neighboring
cells. These results provide insights into the molecular 
mechanism of BMP signaling and the pathogenesis 
of vascular lesions of certain genetic disorders, 
including hereditary hemorrhagic telangiectasia. 

INTRODUCTION 

Bone morphogenetic proteins (BMPs) are members of
the transforming growth factor-β (TGF-β) family, which
regulate a variety of cellular processes including
differentiation, proliferation, migration and cell death in a
cell type-specific and context-dependent manner (1). Perturbations
of BMP signaling pathways have been implicated in a 
diverse set of developmental disorders, tumorigenesis 
and diseases including ectopic ossification and cardiovascular
diseases. Mutations in the ENG, ACVRL1 or SMAD4
genes have been shown to cause hereditary hemorrhagic
telangiectasia (HHT) (2-4), which is a multisystemic vascular
disorder characterized by epistaxis, telangiectases
and arteriovenous malformation (AVM). The ACVRL1
gene encodes an endothelial-specific type I receptor for
TGF-β family members, ALK-1, whose signals are transmitted
through BMP-specific receptor-regulated Smads (BR-Smads;
Smad1/5/8) (5). Recent work has indicated that
haploinsufficiency of ALK-1 causes HHT (6). The ENG
gene encodes Endoglin, an endothelially expressed
co-receptor that modulates ALK-1 signaling (7). The
SMAD4 gene encodes a common mediator Smad (co-Smad),
which forms a heterotrimeric complex with BR-Smads
and regulates transcription of specific target genes
(8). Therefore, dysregulated BMP signaling through
ALK-1 in endothelial cells (ECs) is implicated in the
pathogenesis of HHT. Interestingly, BMP signaling activated by
BMP type I receptors other than ALK-1 in ECs is not
able to compensate for the loss of function of ALK-1. On
the other hand, aberrant BMP signaling through the BMP
type II receptor (encoded by BMPR2), especially in
pulmonary arterial smooth muscle cells (PASMCs), is
implicated in the pathogenesis of pulmonary arterial
hypertension (PAH) (9,10). Therefore, the readout of BMP
signaling depends on the strength of BMP signaling,
the type I and type II receptors and co-receptors involved,
and the cell type.

A binding sequence for BR-Smad was originally identified
in Drosophila. Kim and colleagues (11) indicated that
GCCGnCGC is a consensus binding sequence for Mad
(Drosophila Smad1). In mammals, similar GC-rich
sequences, e.g. GCCG or GGCGCC, have been evaluated
in the promoter regions of well-known BMP target genes
(12,13). Recombinant protein of the DNA binding domain
of mouse Smad1 (Smad1 MH1) has been shown to bind to
the GGCGCC sequence in vitro (14). This GGCGCC
sequence is widely accepted as a binding sequence for
BR-Smads, although a binding motif for BR-Smads has not
been clearly defined in mammals.

Recent advances in microarray and sequencing technologies
have made it possible to analyze global gene expression
profiles and genome-wide maps of protein binding sites or
epigenetic marks (15). Two groups have reported genome-wide
analyses of BR-Smads in mouse ES cells (mESCs)
using chromatin immunoprecipitation (ChIP) coupled with
promoter array (ChIP-chip) and ChIP followed by
sequencing (ChIP-seq) analyses (16,17). Through profiling of
the global binding sites of 13 transcription factors and 2
transcription regulators in mESCs, Chen and colleagues
(16) hypothesized that Smad1 makes an enhancer complex
with Sox2-Oct4 (also known as Pou5f1), which defines
ES-specific binding patterns of Smads. However, it has not
been clarified whether a transcription factor complex, or an
enhancer complex, determines the cell type-specific binding
patterns of Smads in other cell types.

Here, we performed ChIP-seq to map Smad1/5 occupancy
at high resolution in two different primary human
cell types treated with several BMP isoforms: human umbilical
vein endothelial cells (HUVECs) with BMP-9 or BMP-6
and PASMCs with BMP-4. Smad1/5 preferentially bound
to regions outside the promoters of known genes, and
their binding was associated with upregulation of target
genes. In HUVECs, Smad1/5 binding regions overlapped
with reported enhancer regions. Comparison of HUVECs
and PASMCs revealed that about 20% of the binding
regions overlapped. In contrast, most of the Smad1/5
binding sites in HUVECs treated with BMP-6 overlapped
with those in HUVECs treated with BMP-9, especially in the
regions with higher affinity for Smads. Cell-selective Smad1/5
binding patterns appear to be determined mostly by cell-specific
differences in baseline chromatin accessibility patterns. In
addition, a Smad1/5 binding motif was identified and
termed a GC-rich Smad Binding Element (GC-SBE).
Interestingly, the GGAGCC sequence was enriched in the
HUVEC- or PASMC-specific Smad1/5 binding regions
compared with the GGCGCC sequence. We revealed that
mutations of GC-SBE affected binding of the Smad complex
in a cell type-specific manner. Furthermore, we
characterized EC-specific Smad1/5 target genes and found that
several Notch signaling pathway-related genes were induced
in ECs. Among them, a Notch ligand, JAG1, was regulated
directly by Smad1/5, transactivating Notch signaling in
the neighboring cells. These results provide insights into
the molecular mechanism of BMP signaling and the
pathogenesis of vascular lesions of HHT.

MATERIALS AND METHODS 

Cell culture 

HEK293T, HepG2 and HeLa cells were obtained from the
American Type Culture Collection (ATCC). HUVECs
and PASMCs were obtained from Lonza. HMEC-1, an
immortalized human dermal microvascular EC line, was
obtained from Dr T. Lawley (Emory University, Atlanta,
GA, USA). HEK293T, HepG2 and HeLa cells were maintained
in Dulbecco's modified Eagle's medium (Gibco),
supplemented with 10% (v/v) fetal bovine serum (FBS)
(HyClone) and 1% penicillin-streptomycin (Gibco).
HUVECs and HMEC-1 were cultured in EGM-2
medium (Lonza). PASMCs were cultured in SmGM-2
(Lonza).

Reagents and antibodies 

Recombinant human BMP-4, BMP-6 and BMP-9 were
purchased from R&D Systems. TNF-α was from
PeproTech. Cycloheximide (CHX) was purchased from
Sigma-Aldrich.

The following antibodies were used: anti-Flag M2
(Sigma-Aldrich), anti-α-tubulin (AC-15; Sigma-Aldrich),
anti-HDAC-1 (2E10; Upstate Millipore), anti-Smad1
(Bio Matrix Research, Chiba, Japan), which recognizes
both Smad1 and Smad5, for the ChIP procedure,
anti-Smad1/5/8 (N-18; Santa Cruz Biotechnology) for
western blotting, anti-phospho-Smad1/5 (Cell Signaling
Technology), anti-phospho-Smad1/5/8 (Cell Signaling)
and anti-JAG1 (H-114; Santa Cruz).

ChIP 

Chromatin isolation, sonication and immunoprecipitation
(IP) using anti-Smad1/5 antibody were performed
essentially as described (18).

ChIP-sequencing (ChIP-seq) and data analysis

High-throughput sequencing of the ChIP fragments was
performed using an Illumina Genome Analyzer (Illumina)
following the manufacturer's protocols. One flow cell
lane was used for sequencing of each pooled sample.
Unfiltered 36 bp sequence reads were aligned against the
human reference genome (NCBI Build 36, hg18) using
ELAND (Illumina). Peaks were called using CisGenome
v1.2 (19) by two-sample analysis, where input genomic
DNA was used as a negative control (Supplementary
Table S1). Each binding site was assigned to the nearest
gene within 100 kb of the peak using CisGenome.
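
The nearest-gene assignment can be illustrated with a short sketch. It is not CisGenome's implementation: the `Gene` record, the `nearest_gene` function and the choice to measure distance from the peak summit to the transcription start site are assumptions made only for this example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical gene record: (gene name, chromosome, transcription start site).
Gene = Tuple[str, str, int]


def nearest_gene(chrom: str, summit: int, genes: Sequence[Gene],
                 max_dist: int = 100_000) -> Optional[str]:
    """Return the gene whose TSS is closest to the peak summit.

    Returns None when no TSS on the same chromosome lies within
    max_dist bp (100 kb here, matching the window stated in the text).
    """
    best_name: Optional[str] = None
    best_dist = max_dist + 1
    for name, gene_chrom, tss in genes:
        if gene_chrom != chrom:
            continue
        dist = abs(tss - summit)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

A peak with no gene inside the window stays unannotated, which is one reason the number of annotated genes can be smaller than the number of peaks.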

A set of random genomic control regions for the
3750 Smad1/5 binding regions was generated by
randomly selecting the same number of 301 bp
chromosome-matched sequences. To calculate
the frequency of transcription factor binding site (TFBS)-positive
sequences, the MATCH score of the position-specific
scoring matrix (PSSM) for each transcription factor was
computed. The highest MATCH score (HMS) was
assigned to each sequence, and the number of sequences
with an HMS greater than or equal to a threshold was
counted. To obtain background data for the
3750 Smad1/5 binding regions, the chromosome-matched
sequences were generated randomly 1000 times. The
distribution of HMSs in the 1000 sets of 3750 sequences was
used as the background control for each PSSM. The threshold
was set to the mode of the distribution of HMSs. PSSMs
were obtained from the JASPAR database (20). A set of
non-overlapping matched genomic control sequences was
generated by CisGenome. The frequency of TFBSs in these
sequences (motif counts per set of sequences) was
computed with a likelihood ratio >500 (default value of
CisGenome) or >200 (for shorter motifs such as MEME4).
Fifty sets of matched control sequences were used as
background data. Mapping of TFBSs to the specific genomic
regions was performed using CisGenome.
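
The HMS counting step can be sketched as below. This is a simplified stand-in and not the MATCH algorithm itself: it assumes a plain additive score over PSSM columns, scans both strands, and expects uppercase A/C/G/T input. The names `Pssm`, `highest_match_score` and `count_positive` are hypothetical, and the choice of threshold (the mode of the background HMS distribution) is left to the caller.

```python
from typing import Dict, List, Sequence

# One {base: score} column per motif position (assumed representation).
Pssm = List[Dict[str, float]]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def highest_match_score(seq: str, pssm: Pssm) -> float:
    """Best additive PSSM score over all windows on both strands."""
    width = len(pssm)
    best = float("-inf")
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - width + 1):
            score = sum(col.get(strand[start + k], 0.0)
                        for k, col in enumerate(pssm))
            best = max(best, score)
    return best


def count_positive(seqs: Sequence[str], pssm: Pssm, threshold: float) -> int:
    """Number of sequences whose highest score reaches the threshold."""
    return sum(highest_match_score(s, pssm) >= threshold for s in seqs)
```

Applying `count_positive` to the binding regions and to each random control set gives the two counts whose comparison underlies the enrichment estimate.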

The sequences of Smad1/5 binding sites were submitted to
MEME (21) with the following options: mod = oops,
nmotifs = 5, minw = 6, maxw = 8 and revcomp; all other
settings were left at their defaults. The logo plots were generated
using the seqLogo package in R
(http://bioconductor.org/packages/2.6/bioc/html/seqLogo.html).
Enriched binding motifs were also obtained from the
Cis-regulatory Element Annotation System (CEAS) website
as described (http://ceas.cbi.pku.edu.cn/index.html)
(22,23). Overrepresented gene ontology (GO) categories
for genes associated with Smad1/5 binding regions were
determined using the Database for Annotation,
Visualization and Integrated Discovery (DAVID v6.7;
http://david.abcc.ncifcrf.gov) (24).

ChIP and quantitative-PCR 

Real-time PCR was conducted as described (23).
Primer sequences are given in Supplementary Table S2
in the Supplementary Data. The amount of immunoprecipitated
DNA was calculated relative to the input.
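
The text does not give the formula used to express ChIP signal relative to input. One common convention is the percent-input calculation sketched here; the function name, the 1% input fraction and the assumed amplification efficiency of 2 per cycle are illustrative assumptions, not values taken from this study.

```python
def percent_input(ct_ip: float, ct_input: float,
                  input_fraction: float = 0.01,
                  efficiency: float = 2.0) -> float:
    """Percent of input recovered in the IP sample.

    ct_ip / ct_input: qPCR threshold cycles of the IP and input samples.
    input_fraction:   fraction of chromatin saved as input (assumed 1%).
    efficiency:       amplification factor per cycle (assumed ideal, 2.0).
    """
    return 100.0 * input_fraction * efficiency ** (ct_input - ct_ip)
```

Under these assumptions, an IP sample that crosses the threshold one cycle earlier than a 1% input corresponds to 2% of input.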

RNA isolation, quantitative real-time reverse 
transcription-PCR and conventional RT-PCR 

Extraction of total RNA, qRT-PCR and conventional 
RT-PCR were performed as described (23). Primer se- 
quences are given in Supplementary Table S2. 

Gene expression profiling 

HUVECs and PASMCs were serum starved overnight and
treated with or without BMP-9 (1 ng/ml), BMP-6 or
BMP-4 (50 ng/ml) for 2 or 24 h. Gene expression
profiling was performed with a GeneChip Human
Genome U133 Plus 2.0 Array (Affymetrix) as described
(18). The 8544 and 8067 genes whose signal intensity
exceeded 100 at any time point were considered to be
expressed and functional in HUVECs and PASMCs,
respectively. The heatmaps were produced using the
heatmap.2 function from the gplots library in R
(http://cran.r-project.org/web/packages/gplots/).
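
The expression filter, together with the "more than 2-fold" criterion applied in the Results, can be written as two small helper functions. This is a minimal sketch; the names `is_expressed` and `classify_change` are hypothetical and signal values are assumed to be on a linear (not log) scale.

```python
from typing import Sequence


def is_expressed(signals: Sequence[float], cutoff: float = 100.0) -> bool:
    """True if the signal intensity exceeds the cutoff at any time point."""
    return any(s > cutoff for s in signals)


def classify_change(baseline: float, treated: float,
                    fold: float = 2.0) -> str:
    """Label a gene 'up', 'down' or 'unchanged' by strict fold change."""
    if baseline <= 0 or treated <= 0:
        raise ValueError("signal intensities must be positive")
    ratio = treated / baseline
    if ratio > fold:
        return "up"
    if ratio < 1.0 / fold:
        return "down"
    return "unchanged"
```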

Histone modification data 

Genome-wide histone modification maps for
monomethylation of histone H3 lysine 4 (H3K4me1),
trimethylation of histone H3 lysine 4 (H3K4me3) and acetylation of
histone H3 lysine 27 (H3K27ac) in HUVECs were
produced and released by the ENCODE Project (25) and
were downloaded from UCSC
(http://hgdownload.cse.ucsc.edu/goldenPath/hg18/encodeDCC/wgEncodeBroadChipSeq/).



Plasmid construction 

FLAG-tagged Smad constructs were previously described
(18). Each fragment of Smad1/5 binding regions was
amplified from human genomic DNA by PCR and cloned into
a modified pGL4.10 reporter plasmid (Promega) driven by
the minimal adenoviral major late promoter (MLP) (12). A
point mutation was introduced by site-directed mutagenesis
using PCR with specific primers. A reporter construct
with six multimerized CTGGAGCC sequences
(pGL4-6xGC-SBE-Luc) was constructed as follows. A fragment
with one copy of the CTGGAGCC sequence was cloned
into the pcDNA3.1 vector (Invitrogen). The sequences of
the oligonucleotides were
5'-AGATCTTCGAACAGCTCTGGAGCCAGATGGCCTGGATCC-3' (sense) and
5'-GGATCCAGGCCATCTGGCTCCAGAGCTGTTCGAAGATCT-3' (antisense).
This fragment was multimerized in
tandem, and the fragment containing six tandem copies
was subcloned into the modified pGL4-MLP plasmid. Six
multimerized dimeric CBF1/Suppressor of Hairless/Lag1
(CSL) binding sites with the Epstein-Barr virus (EBV) TP1
promoter sequence of pGA981-6 (26) were transferred
into the pGL4.10 reporter plasmid (pGL4-12xCSL-Luc) and
used for the Notch reporter assay. A plasmid encoding
GST-hSmad1-MH1 was constructed by PCR amplification of
the MH1 domain of human Smad1 (amino acids 1-143).
The fragment was subcloned into the pGEX-6P-1 vector (GE
Healthcare, Chalfont St Giles, UK). All constructs were
verified by DNA sequencing.

Protein production and purification 

Bacterially expressed GST fusion proteins containing
residues 1-143 of human Smad1 (GST-hSmad1-MH1)
were purified with Glutathione Sepharose 4B beads (GE
Healthcare) followed by cleavage with PreScission
Protease (GE Healthcare) at 4°C overnight according to
the recommendations of the manufacturer. The concentration
of the protein was measured with the BCA Protein Assay
Kit (Pierce).

Electrophoretic mobility shift assays 

Electrophoretic mobility shift assays (EMSAs) were conducted
essentially as described previously (14) and
detected with the LightShift Chemiluminescent EMSA kit
(Pierce). The sequences of the DNA oligos are provided
in Supplementary Table S2.

Lentiviral infection and luciferase assays 

Because transfection efficiency is very low in HUVECs and
transfection is rather toxic to them, we adopted a lentiviral
expression system. pGL4 constructs were subcloned between the
EcoRI and XhoI sites of the lentiviral vector construct
CS-CDF-CG-PRE. Recombinant lentiviral vectors were generated
as reported previously (23).

Stably expressing cells were stimulated with the indicated
doses of BMP-9 or BMP-6, and then they were harvested
and assayed for luciferase activity at 12 h after stimulation.
Luciferase activities of the cell lysates were determined
using the Dual-luciferase Reporter Assay System
(Promega).

Transient transfection and dual-luciferase assays

Transient transfection was carried out using FuGENE 6 
(Roche) for HEK293T cells, Lipofectamine 2000 
(Invitrogen) for HepG2 cells and Lipofectamine LTX 
(Invitrogen) for HMEC-1 cells and PASMCs according 
to the recommendations of the manufacturer. 

Cells were transiently transfected with 1 µg of the
luciferase reporter constructs along with 0.05 µg of the Renilla
luciferase reporter vector pGL4.74[hRluc/TK] (Promega)
as an internal control. For HMEC-1 and PASMCs, the
medium was changed at 3 h after transfection. Cells were
stimulated with 1 ng/ml BMP-9 (HMEC-1), 50 ng/ml
BMP-6 (HepG2) or 30 ng/ml BMP-4 (PASMCs) at 24 h
after transfection, and then they were harvested and
assayed for luciferase activity at 16 h after stimulation.
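
The role of the Renilla internal control can be made concrete with a short sketch of a typical dual-luciferase normalization. The text does not describe how readings were processed, so the function names and the fold-over-unstimulated readout below are assumptions.

```python
from typing import Tuple

# A reading is (firefly signal, Renilla signal) from the same lysate.
Reading = Tuple[float, float]


def relative_luciferase(reading: Reading) -> float:
    """Firefly signal normalized to the Renilla internal control."""
    firefly, renilla = reading
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_induction(stimulated: Reading, unstimulated: Reading) -> float:
    """Normalized reporter activity relative to the unstimulated sample."""
    return relative_luciferase(stimulated) / relative_luciferase(unstimulated)
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number before stimulated and unstimulated samples are compared.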

RNA interference 

Duplexes of small interfering RNA (siRNA) against human 
Smad4 (D-003902-05) were synthesized by Dharmacon 
(Thermo Fisher Scientific), and were transfected using 
Lipofectamine RNAiMAX (Invitrogen) according to the 
recommendations of the manufacturer. The final concen- 
tration of siRNA in the culture media was 10 nM. At 24 h 
after transfection, cells were serum starved overnight, 
treated with or without BMP-9 for 2 or 24 h and subjected 
to qRT-PCR. 

Western blotting 

Western blotting was performed essentially as described 
(18). Cytoplasmic and nuclear fractions were isolated 
using NE-PER Nuclear and Cytoplasmic Extraction 
Reagents (Pierce) according to the recommendations of 
the manufacturer. 

Immunofluorescence microscopy 

HUVECs were treated with 1 ng/ml BMP-9 for 24 h, fixed
in 10% formalin for 20 min and incubated overnight at
4°C with primary antibodies (JAG1, 1:100 dilution) diluted
in Blocking One solution (Nacalai Tesque, Kyoto, Japan).
The cells were washed with PBST (PBS with 0.1% Triton
X-100), and then incubated with secondary antibodies
(Alexa Fluor 488 goat anti-rabbit IgG, Invitrogen, 1:500
dilution) for 2 h and TOTO-3 (Invitrogen) for 10 min at
room temperature. Images were obtained with a Zeiss
LSM 510 Meta confocal microscope (Carl Zeiss).

Transactivation (coculture) Notch assay 

One day prior to transfection, HeLa cells were seeded at
a density of 5.0 × 10^4 cells per well in 12-well plates. Cells
were transiently transfected with 1 µg of the
pGL4-12xCSL-Luc reporter construct along with 0.05 µg of
pGL4.74[hRluc/TK] (Promega) using Lipofectamine
2000 (Invitrogen). At 16 h after transfection, the medium was
changed to a 1:1 mixture of DMEM and EGM-2 and then
1.0 × 10^5 HUVECs were added. After adhesion of HUVECs
(about 2 h later), cells were treated with or without 5 ng/ml
BMP-9 for 24 h, and subjected to luciferase assay.



Statistical analysis 

The difference between experimental groups of equal
variance was analyzed using Student's t-test, with P < 0.05
being considered significant. All experiments were
performed at least three times independently and similar
results were obtained.
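
For reference, the equal-variance (pooled) form of Student's t statistic can be computed with the standard library alone, as in this minimal sketch. The function name is hypothetical. The P-value additionally requires the t distribution with the returned degrees of freedom, for which a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=True`) could be used.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_statistic(a: Sequence[float],
                       b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic assuming equal variances.

    Returns (t, degrees of freedom). Each group needs at least two values.
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(a) - mean(b)) / std_err, df
```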



RESULTS 

Genome-wide identification and characterization of
Smad1/5 binding sites in HUVECs and PASMCs

We performed ChIP-seq analyses using HUVECs stimulated
with BMP-9 (1 ng/ml) or BMP-6 (50 ng/ml) and PASMCs
treated with BMP-4 (50 ng/ml). Doses of the ligands for
HUVECs were determined based on the phosphorylation
status of BR-Smads and the physiological range of the
circulating ligands. BMP-9 has been identified as a major
circulating ligand for ALK-1 (27). The serum concentration of
BMP-9 ranges from 1 to 12 ng/ml, which is enough for full
activation of ALK-1 (Supplementary Figure S1A) (28,29).
Thus, it is thought to play important roles in the control
of vascular quiescence. BMP-6 transduces its signal mainly
through the BMP type I receptor ALK-2 (encoded by
ACVR1), which is also a receptor for BMP-9 (1).
Notably, BMP-6 exists in FBS at concentrations of
2-10 ng/ml (29), and BMP-6 has been reported to activate
ECs (30). However, 10 ng/ml was not enough to activate
Smad1/5 in HUVECs (Supplementary Figure S1A). We
selected a BMP-6 concentration of 50 ng/ml for our
experiments, which gave an equivalent induction of ID1
mRNA (Supplementary Figure S1B), and almost as high a
phospho-Smad1/5/8 level in the nuclear fraction as
stimulation with 1 ng/ml BMP-9 (Supplementary Figure S1C).
We also confirmed that 50 ng/ml BMP-4 was enough for
full activation in PASMCs (Supplementary Figure S1D).
Both HUVECs and PASMCs expressed Smad1 and
Smad5 (Supplementary Figure S1E).

The anti-Smad1/5 antibody worked efficiently in IP under
formalin-fixed conditions (Supplementary Figure S1F and
S1G). Human genomic DNA sequences that corresponded
to known BMP responsive elements (BREs) in the mouse
Id1 (12) and mouse Hey1 (13) promoters were used as
positive control regions. In HUVECs, a comparable
enrichment of Smad1/5 was confirmed at the ID1 promoter
after BMP-6 or BMP-9 treatment, while weak Smad1/5
binding was observed at the HEY1 promoter after
BMP-6 stimulation (Supplementary Figure S1H). Since
maximal Smad1/5 binding was observed at 1.5 h after
stimulation with BMPs, we adopted this time of
stimulation for ChIP-seq analyses, the same stimulation time
that was used in similar studies of Smad2/3 (18) and
Smad4 (31).

The ChIP DNA and the control input DNA were then
submitted to high-throughput sequencing analyses. The
enriched genomic regions were determined using
CisGenome (19). Using a false discovery rate (FDR) cutoff
of 0.1, a total of 3750 Smad1/5 binding regions were
identified in the ChIP-seq data of HUVECs treated with
BMP-9, 880 in HUVECs treated with BMP-6 and 2745 in
PASMCs treated with BMP-4 (Figure 1A and B;
Supplementary Figure S1I). To validate the results,
BMP-9-dependent Smad1/5 enrichment was confirmed
by ChIP-qPCR at 20 novel Smad1/5 binding regions of
variable peak intensity (Figure 1C). The ChIP-seq peaks
were annotated to a total of 2179, 563 and 1609 genes,
respectively (Supplementary Table S1). Approximately
30% of the binding sites were located in the introns of
known genes and 20% in the promoter regions within
10 kb upstream of known transcription start sites (TSSs)
(Figure 1D). These Smad1/5 binding regions were highly
conserved among multiple vertebrate species
(Supplementary Figure S1J).

Comparison of the three ChIP-seq data sets revealed that
~20% of Smad1/5 binding regions overlapped between
HUVECs and PASMCs (Figure 2A), while most of the
Smad1/5 binding sites in HUVECs after BMP-6 stimulation
overlapped with those after BMP-9 stimulation,
especially in the higher ranked peaks (Figure 2A and B).
Common Smad1/5 binding sites shared between HUVECs
treated with BMP-9 and those treated with BMP-6, including
those at the ID1 and ID3 loci, had comparable levels of
Smad1/5 binding, suggesting that these sites had higher
affinity for Smad1/5, while the BMP-9-specific sites
(Figure 1A and B and Supplementary Figure S1I) had
weaker affinity. In line with this hypothesis, increasing
concentrations of BMP-6 dose-dependently enhanced
Smad1/5 binding to the BMP-9-specific binding sites, e.g.
at the HEY1 and JAG1 loci (Figure 2C), whereas common
Smad1/5 binding sites, e.g. at the ID1 and ENG loci, had
sufficient enrichment when stimulated with only 20 ng/ml
BMP-6 (Figure 2C). These dose response data also
indicated that 50 ng/ml BMP-6 was not enough to achieve
Smad1/5 binding to target sites with relatively lower
affinity for Smad1/5 in ECs.
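
The overlap comparison between peak sets reduces to an interval-intersection count. The sketch below shows one simple way to compute the fraction of peaks in one data set that overlap any peak in another. It is not the procedure used in the study: the `Peak` tuple, the half-open coordinate convention and the "any shared base" overlap criterion are assumptions for illustration.

```python
from typing import Sequence, Tuple

# (chromosome, start, end) with half-open coordinates [start, end).
Peak = Tuple[str, int, int]


def overlaps(p: Peak, q: Peak) -> bool:
    """True if two peaks on the same chromosome share at least one base."""
    return p[0] == q[0] and p[1] < q[2] and q[1] < p[2]


def fraction_overlapping(peaks_a: Sequence[Peak],
                         peaks_b: Sequence[Peak]) -> float:
    """Fraction of peaks in A that overlap at least one peak in B."""
    if not peaks_a:
        return 0.0
    hits = sum(any(overlaps(p, q) for q in peaks_b) for p in peaks_a)
    return hits / len(peaks_a)
```

The pairwise scan is quadratic in the number of peaks, which is acceptable for a few thousand regions; sorted or indexed intervals would be preferable for larger sets. The result also depends on which data set is taken as A.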

Smad1/5 bind to enhancer regions already accessible in
specific cell types

To investigate the biological functions associated with
Smad1/5 binding in HUVECs and PASMCs, the
significance of functional annotation clustering of the GO of the
genes related to Smad1/5 binding was assessed using
DAVID (24). This analysis showed that the most highly
enriched GO category of biological function for
HUVEC-specific genes with BMP-9 stimulation was related to
blood vessel development, while that for PASMC-specific
genes with BMP-4 stimulation was related to
extracellular matrix production (Figure 3A). Thus, Smad1/5
bind to different sets of target sites in different cell
types, which may be related to the cell type-specific
function.

In order to identify the cell type-specific binding
mechanism for the Smad complex, we sought known binding
motifs enriched in the Smad1/5 binding regions using
the CEAS website (22). Interestingly, ETS, AP-1, AP-2
and SP-1 binding sites were enriched in the Smad1/5
binding regions in both HUVECs and PASMCs, while
other motifs occurred only in a small proportion of the
sequences analyzed (Supplementary Table S3). We also
conducted de novo motif prediction in order to find
overrepresented motifs in the HUVEC- and
PASMC-specific Smad1/5 binding regions using MEME
(21). The motifs obtained were then compared with the
TRANSFAC (32) and JASPAR (20) databases of known
motifs, and ranked by their similarity using the
TOMTOM program (33). The predicted motifs were
similar to the ones identified by CEAS (Supplementary
Figure S2). These results suggest that these transcription
factors do not determine the cell type-specific BR-Smad
binding pattern.

To evaluate the association between Smad1/5 binding
and gene expression regulation, expression microarray
analyses were performed at several time points (0, 2 and
24 h). We confirmed an equivalent induction of ID
proteins after BMP-6 or BMP-9 stimulation in HUVECs
(Supplementary Figure S3A). Combining the mapping
data with gene expression profiles revealed that Smad1/5
binding regions were enriched in early upregulated genes
(Figure 3B and C). Notably, in HUVECs treated with
BMP-9, 108 genes were upregulated and 37 were
downregulated more than 2-fold in the early phase, and 70 of the
108 upregulated genes (64%) and 9 of the 37
downregulated genes (24%) were associated with Smad1/5
binding regions (Supplementary Figure S3B). We consider
these 70 upregulated genes (corresponding to 170 binding
sites) as putative direct target genes of ALK-1 in ECs
(Supplementary Table S4). We also identified 19 putative
direct target genes in PASMCs using the same criteria
(Supplementary Table S4).

Smad1/5 binding regions in HUVECs were further
characterized using differential histone modification marks,
which were produced and released by the ENCODE
Project (25). H3K4me3 is associated with promoters and
H3K4me1 is preferentially associated with enhancers.
H3K27ac is associated with active regulatory regions
(34). As many as 3651 Smad1/5 binding peaks (97.4%)
overlapped with H3K4me1 or H3K4me3 regions of
HUVECs. Among them, 3201 Smad1/5 binding peaks
(85.4%) overlapped with enhancer regions, characterized
by both H3K4me1 and H3K27ac (Figure 3D and
Supplementary Figure S3C) (34). In PASMCs, 86.5%
(724/837) of the common Smad1/5 binding peaks shared between
HUVECs and PASMCs overlapped with enhancer regions
of HUVECs characterized by both H3K4me1 and
H3K27ac, while only 54.3% (1036/1908) of PASMC-specific
peaks overlapped with endothelial enhancers
(Figure 3D). Thus, these data also suggest that Smad1/5
preferentially bind to enhancer regions already accessible
in specific cell types.

GC-SBE is a direct binding motif for Smad1/5

To elucidate a specific binding motif in Smad1/5 binding
regions, a de novo motif prediction was performed using
MEME (21). Since ChIP experiments may detect indirect
Smad1/5-DNA binding through protein-protein
interaction, we focused on the 170 Smad1/5 binding regions
of BMP-9 target genes in HUVECs. Five overrepresented
motifs were identified and designated as MEME1-5
(Figure 4A and Supplementary Figure S4A). These
motifs were validated in three ways. First, the fold
enrichment of the motifs was compared (Figure 4B and
Supplementary Figure S4B). Randomly selected genomic
sequences (n = 1000) or non-overlapping matched regions
(n = 50) were used as background controls. Four out of
five motifs were significantly enriched in the Smad1/5
binding regions, and MEME2 showed the strongest enrichment.
The TFAP2A (also known as AP-2α) binding motif was a
positive control and was found to be enriched in the
Smad1/5 binding regions (Supplementary Table S3).
No statistically significant differences were observed
for motifs of transcription factors known to be
expressed and functional in ECs, such as GATA2 (35). In
contrast to the study of Chen and colleagues (16), the
motifs for SOX2 and POU5F1 (also known as OCT4)
were not enriched in the Smad1/5 binding regions,
suggesting that different mechanisms or different enhancer
complexes are adopted in differentiated ECs compared
with mESCs. Second, the incidence of the MEME
motifs in the peaks was calculated. MEME2 occurred in
about 45% of all Smad1/5 binding regions in HUVECs
and PASMCs (Figure 4C and Supplementary Figure
S4C). Moreover, it was enriched in the higher ranked
peaks in HUVECs treated with BMP-9 (Supplementary
Figure S4D). Finally, the relative distribution of the
motif around the peak summits, where Smad1/5 was
expected to be located, was analyzed. MEME2 was
enriched in the Smad1/5 binding regions, especially around
the peak summits, while other MEME motifs were not
(Figure 4D and Supplementary Figure S4E and F). We
therefore designated MEME2 as GC-SBE because it is
similar in sequence to the previously reported GC-rich
sequences for BR-Smads (11-13).

Analysis of the frequency of GC-SBE sequences in
Smad1/5 binding regions revealed that the GGCGCC
sequence was enriched in Smad1/5 binding regions shared
between HUVECs and PASMCs, while the GGAGCC sequence
was enriched in both HUVEC- and PASMC-specific
binding regions (Figure 5A). To validate the enhancer
activity of the Smad1/5 binding regions and the
effects of the newly identified GC-SBE on cell type
specificity, luciferase assays were performed in HUVECs.
Both BMPR2 and JAG1 were HUVEC-specific target
genes (Supplementary Table S4). Fragments from
Smad1/5 binding regions in intron 3 of BMPR2 and the
JAG1 promoter, which contain the GGAGCC sequence,
were cloned into a luciferase reporter construct
(Supplementary Figure S5A). Both BMP-9 and BMP-6
were able to activate these reporters in HUVECs,
while BMP-4 induced only a weak response in PASMCs
(Figure 5B and Supplementary Figure S5B and S5C).
Consistent with the ChIP data (Figure 1C), the Smad1/5
binding regions induced higher luciferase expression
following treatment with BMP-9 compared with BMP-6.
Even 1 ng/ml BMP-9 induced stronger luciferase activities
in HUVECs than 50 or 200 ng/ml BMP-6 (Figure 5B). We
also confirmed that these Smad1/5 binding fragments
worked as transcriptional enhancers in the human
microvascular endothelial cell line, HMEC-1 (Supplementary
Figure S5D).
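
The comparison of GGCGCC and GGAGCC frequencies amounts to asking what fraction of binding regions contains a given hexamer on either strand. The following sketch is one way to compute that. The function names are hypothetical, and counting each region at most once is an assumption about how frequency was defined.

```python
from typing import Sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_site(region: str, site: str) -> bool:
    """True if the site occurs on either strand of the region."""
    region = region.upper()
    return site in region or reverse_complement(site) in region


def site_frequency(regions: Sequence[str], site: str) -> float:
    """Fraction of regions containing the site at least once."""
    if not regions:
        return 0.0
    return sum(contains_site(r, site) for r in regions) / len(regions)
```

GGCGCC is its own reverse complement, so a single-strand search finds all its occurrences, whereas GGAGCC must also be searched as GGCTCC.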

In order to compare the difference in enhancer activities
between the GGAGCC and GGCGCC sequences, a point
mutation was introduced at the 'A' in the GGAGCC
sequence. A mutation to GGCGCC induced higher
luciferase expression compared with the GGAGCC
wild-type. In contrast, a mutation to GGGGCC attenuated
BMP responsiveness (Figure 5B and Supplementary
Figure S5B). Interestingly, the fragments with the GGAGCC
sequence did not respond to BMP stimulation in
PASMCs, whereas mutation to GGCGCC showed a
higher responsiveness (Supplementary Figure S5C).
Luciferase assays were also performed in HepG2 cells
to examine the cell type specificity of the fragments.
Similarly, the GGCGCC mutant responded very well
compared with wild-type and the T-mutant, while the
G-mutant had no enhancer activity in HepG2
(Supplementary Figure S5E).

We next showed the direct binding of recombinant
human Smad1 MH1 (rhSmad1 MH1) to the GGAGCC
sequence using EMSAs. The amino acid sequence of
rhSmad1 MH1 is identical to the corresponding sequence
of mouse Smad1 MH1, which was reported to bind to the
GGCGCC sequence (14). rhSmad1 MH1 was able to bind
to the GGAGCC probe and this binding was blocked by
the wild-type oligonucleotide but not by the mutated one
(Figure 5C). The effects of single-point mutations in the
GGAGCC sequence were also examined. The GGCGCC
sequence competed more efficiently than the GGAGCC
sequence, suggesting that this sequence had higher affinity
for binding to rhSmad1 MH1 (Figure 5C). Thus, these
results showed that the GGAGCC sequence is also a
direct binding motif for Smad1/5 and that GC-SBE is a
generalized form of the previously reported GC-rich
sequences.

Figure 5. Validation of GGAGCC sequence as a novel BMP responsive element. (A) Frequency of GC-SBE sequences in the Smad1/5 binding

Both GC-SBE and SBE are required for full BMP 
responsiveness 

In Drosophila, Dpp (Decapentaplegic; the Drosophila BMP 
ortholog)-responsive elements have been shown to contain a 
GC-rich Mad binding site and a flanking GTCT Medea 
(Drosophila Smad4) binding site with a 5 bp spacer 
sequence (36). Indeed, Smad3 binding motifs were 
significantly enriched in the Smad1/5 binding regions found in 
our analysis (Figure 4B). The analysis of the spacer length 
between GC-SBE and SBE revealed that the 5 bp spacer 
was also prominent in HUVECs (Figure 6A), suggesting 
that the 5 bp spacer also favors binding of the Smad 
complex, which contains Smad1/5 and Smad4, in 
mammalian cells. On the other hand, expression of genes 
associated with the GC-SBE/SBE composite motif with a 
5 bp spacer was not necessarily 
regulated by BMP-9 stimulation (Figure 6B). 
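
A spacer-length analysis of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the GC-SBE is represented only by GGCGCC and GGAGCC (plus the reverse complement GGCTCC), the SBE by GTCT or AGAC, and the function name spacer_lengths and the max_spacer cutoff are hypothetical.

```python
import re
from collections import Counter

# Assumed motif definitions (illustrative only).
# GGCGCC is palindromic; GGCTCC is the reverse complement of GGAGCC.
GC_SBE = re.compile(r"(?=(GGCGCC|GGAGCC|GGCTCC))")
# GTCT and its reverse complement AGAC.
SBE = re.compile(r"(?=(GTCT|AGAC))")


def spacer_lengths(seq, max_spacer=30):
    """Count spacer lengths (bp) between every GC-SBE/SBE pair in `seq`.

    Overlapping motif pairs are ignored; pairs further apart than
    `max_spacer` are not counted.
    """
    seq = seq.upper()
    gc_sites = [(m.start(1), m.end(1)) for m in GC_SBE.finditer(seq)]
    sbe_sites = [(m.start(1), m.end(1)) for m in SBE.finditer(seq)]
    counts = Counter()
    for gc_start, gc_end in gc_sites:
        for sbe_start, sbe_end in sbe_sites:
            if sbe_start >= gc_end:      # SBE lies downstream of GC-SBE
                gap = sbe_start - gc_end
            elif gc_start >= sbe_end:    # SBE lies upstream of GC-SBE
                gap = gc_start - sbe_end
            else:                        # motifs overlap
                continue
            if gap <= max_spacer:
                counts[gap] += 1
    return counts


# Toy sequence: GGCGCC, five bases of spacer, then GTCT.
example = spacer_lengths("AAGGCGCCTTTTTGTCTAA")
```

Summing such counters over all binding regions would give a spacer-length histogram in which a preferred spacing, such as 5 bp, appears as a peak.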

Next, the roles of SBE sequences located at 
different distances from GC-SBE were examined. 
Evolutionarily conserved SBE/GC-SBE composite motifs 
with a 28 bp spacer sequence were found in the BMPR2 
intron 3 and the JAG1 promoter (Supplementary 
Figure S5A). These sequences were able to drive luciferase 
expression in reporter assays in response to BMP 
stimulation (Figure 5B and Supplementary Figure S5B). 
Mutations in either the GC-SBE or the SBE sequence 
significantly attenuated BMP responsiveness (Figure 5B 
and Supplementary Figure S5B), indicating that the 
effective distance between GC-SBE and SBE was not restricted 
to 5 bp. The GC-SBE alone was not able to respond to BMP 
stimulation in luciferase reporter assays, even when 
present in six copies (Figure 5B). These findings clearly 
showed that both GC-SBE and SBE were required for 
full BMP responsiveness. Collectively, our results suggest 
that the binding affinity of Smad complexes to DNA 
is defined by the affinities of GC-SBEs and SBEs 
(Figure 6C). 

JAG1 is a direct target gene of Smad1/5 in ECs and 
transactivates Notch signaling in the neighboring cells 

EC-specific target genes contained well-known Notch- 
signal target genes and signaling components, including 
HEY1, HEY2, HES1, FOXC1, LFNG, NRARP and 
JAG1 (Figure 7A and Supplementary Table S4). 
Synergistic effects between Notch and BMP signaling on 
several Notch target genes, such as HEY1 and CDH2, 
have been reported previously (13,37). However, little is 
known about direct regulation of Notch ligand 
expression by BMP signaling in ECs. 

Two strong Smad1/5 binding regions were identified in 
the JAG1 locus, one in the promoter region at -500 bp from 
the TSS and one in the second intron (Figure 7B); both were 
verified by ChIP-qPCR (Figure 1C). Both regions worked 
as transcriptional enhancers in HUVECs (Supplementary 
Figure S5B). Consistent with the results of the reporter 
assays, BMP-9 was able to induce expression of JAG1 
mRNA (Figure 7C). TNF-α has been shown to induce 
JAG1 expression in ECs (38). The induction by BMP-9 
was equivalent to that by TNF-α, and the two stimuli had 
some additive effects (Figure 7C). Western blot analysis 
and immunocytochemistry revealed that the JAG1 
protein was also upregulated by BMP-9 stimulation in ECs 
(Figure 7D and E). This JAG1 mRNA induction was 
not affected by CHX, and siRNA against SMAD4 
(siSmad4) attenuated BMP-9-mediated upregulation of 
JAG1 (Supplementary Figure S6A and B). These results 
showed that JAG1 is a direct target gene of the 
BMP-Smad1/5 pathway. 

A HeLa reporter cell system was used to verify the 
function of JAG1 as a Notch ligand. HeLa cells were 
transfected with the Notch-specific luciferase reporter 
construct (pGL4-12xCSL-Luc) and were thus responsive to 
Notch activation (26). In the absence of HUVECs, 
BMP-9 did not induce reporter activity in the transfected 
HeLa cells (Figure 7F; lanes 1 and 3). In the presence of 
HUVECs, however, BMP-9 induced strong activation 
of reporter expression (Figure 7F; lanes 2 and 4), 
indicating that JAG1 induced by BMP-9 in ECs was able to 
efficiently transactivate Notch signaling in neighboring 
cells. 



DISCUSSION 

In this study, genome-wide maps of Smad1/5 binding 
regions in human primary cells revealed how BR-Smads 
recognize and regulate their target genes. Both HUVECs 
and PASMCs express Smad1, Smad5 and Smad8 
(Supplementary Figure S1E). However, redundant 
functions between Smad1 and Smad5 have been demonstrated 
in vivo, especially in the vasculature (39). Smad1+/-; 
Smad5+/- double heterozygous mutant mice are embryonic 
lethal and display defects, which closely resemble those 
seen in Smad1- or Smad5-null mice, whereas Smad1 or 
Smad5 single heterozygous mice show no overt phenotype. 
Smad8-null mice additionally lacking one copy of Smad1 
or Smad5 did not exhibit overt phenotypes, and the tissue 
disturbances seen in Smad1- or Smad5-null embryos are 
not exacerbated in the absence of Smad8. These findings 
suggest that Smad1 and Smad5 possess equivalent 
biological functions especially in the vasculature, while 
Smad8 is dispensable. 

The mapping data of Smad1/5 showed that ~30% of 
the binding sites were located in the introns of known 
genes. Of the Smad1/5 binding peaks, 85.4% overlapped with 
enhancer regions in HUVECs, for which histone modification 
markers under basal conditions were available. Motif analysis 
revealed that binding motifs for ETS, AP-1, AP-2 and 
SP-1 were enriched in Smad1/5 binding regions regardless 
of the cell type. These motifs were also enriched in the 
Smad4 binding regions in human keratinocyte HaCaT 
cells (31). Other motifs occurred only in a small 
proportion of the sequences analyzed. Recently, John and colleagues 
(40) reported that cell type-specific glucocorticoid receptor 
binding patterns are comprehensively predetermined by 
cell-specific differences in baseline chromatin accessibility 
patterns, with secondary contributions from local 
sequence features. The similar motif occurrence patterns 
between HUVECs and PASMCs suggest that the binding 
regions of BR-Smad are also predetermined in the specific 
cell types. 
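
The fraction of binding peaks that overlap enhancer regions can be computed with a simple interval comparison. The Python sketch below is illustrative only and is not the method used in this study: it assumes peaks and enhancers are given as (chromosome, start, end) tuples with half-open coordinates, and the function name fraction_overlapping is hypothetical.

```python
from bisect import bisect_right


def fraction_overlapping(peaks, enhancers):
    """Return the fraction of `peaks` overlapping at least one enhancer.

    Both arguments are lists of (chrom, start, end) with half-open
    coordinates, so intervals that merely touch do not overlap.
    """
    # Group enhancer intervals by chromosome.
    by_chrom = {}
    for chrom, start, end in enhancers:
        by_chrom.setdefault(chrom, []).append((start, end))

    # Sort and merge, so that the interval ends are sorted as well.
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])

    hits = 0
    for chrom, start, end in peaks:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        # First merged enhancer that ends after the peak starts.
        i = bisect_right(ends, start)
        if i < len(starts) and starts[i] < end:
            hits += 1
    return hits / len(peaks) if peaks else 0.0


# Toy data: only the first of three peaks overlaps an enhancer.
toy_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 10, 20)]
toy_enhancers = [("chr1", 150, 300), ("chr1", 250, 400), ("chr2", 20, 30)]
toy_fraction = fraction_overlapping(toy_peaks, toy_enhancers)
```

Merging the enhancer intervals first keeps each lookup to a single binary search per peak.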

Smad1/5 reproducibly bound to some target sites, such 
as the ID1 and ID3 loci, with comparable enrichment after 
BMP-9 and BMP-6 stimulation, while the total number 
of Smad1/5 binding sites was dramatically lower in 
HUVECs treated with BMP-6 than in those treated with 
BMP-9 (880 versus 3750). Increasing the dose of BMP-6 
up to 200 ng/ml was not enough to elicit a level 
of enhancer activity comparable to that of 1 ng/ml BMP-9 
(Figures 2C and 5B). This suggests that each binding site has 
a different binding affinity for the Smad complex and that 
BR-Smad signaling through ALK-2 was not enough to occupy 
the full set of target sites in ECs. This is consistent with the 
facts that HHT2 is the result of haploinsufficiency of ALK-1 (6), 
and that ALK-2 signaling is not able to compensate for 
ALK-1 mutations in HHT patients even though BMP-9 
can signal through ALK-2 (1). 

In Drosophila, Ashe et al. (41) have reported that each 
enhancer element for Mad target genes has a different 
binding affinity for Smad/Mad. A gene with low-affinity 
Smad/Mad binding sites is transcribed only in response to 
high concentrations of Dpp, while a gene with higher 
affinity sites responds to a low dose of Dpp. Increasing 
the affinity of the Smad/Mad binding sites in the enhancer 
of Ance (also known as Race) resulted in a wider 
expression pattern in vivo (42). We revealed that a 
mutation in our consensus GC-SBE sequences attenuates 
BMP responsiveness of target genes (Figure 5B and 
Supplementary Figure S5B). In addition, a mutation of 
the HAMP promoter from GGCGCC to GGTGCC, 
which was identified in a hemochromatosis patient, impairs 
the BMP responsiveness in vivo and contributes to the 
severe phenotype (43). These results suggest that the 
binding affinity for the Smad complex is the sum of the 
affinities of GC-SBEs, SBEs and other DNA binding proteins like 
Sox2 and Oct4 in mESCs (16), and that unidentified 
mutations in the BR-Smad binding regions will be implicated 
in HHT or PAH. 

Collectively, our findings support the notion that 
BR-Smad binding sites are predetermined in specific cell types 
and determined by the binding affinity of the Smad complex 
to possible binding sites. This suggests that the strength of 
the BR-Smad pathway is converted to the number and 
distribution of BR-Smad binding sites over the genome. 
This does not necessarily exclude the possibility that 
non-Smad pathways play important roles. Non-Smad 
pathways have been reported to affect the BR-Smad 
pathway through degrading BR-Smads or modulating the 
binding affinity of Smad complexes [for review, see (44)]. 
It is possible that they modulate the intensity of the BR-Smad 
pathway and affect the number and distribution of 
Smad1/5 binding sites in ECs (Figure 6C). 

Dysregulation of Notch signaling has been reported to 
cause AVM [for review, see (45)], which is one of the major 
pathological features of HHT. JAG1 has been reported to 
cause differentiation of vascular smooth muscle cell 
(vSMC) precursor cells and induce vSMC-specific genes 
in vitro through the JAG1-Notch3 signaling pathway 
(46,47). EC-specific deletion of Jag1 showed defects in 
vSMC coverage in mice (38,48). Interestingly, genetic 
and pharmacological inhibition of ALK-1 signaling 
showed a severe vascular phenotype including lack of 
differentiation and recruitment of vSMCs and defects in the 
maturation phase of angiogenesis (5,49,50). In the clinical 
setting, thalidomide has been shown to stimulate vessel 
maturation and have beneficial effects on HHT patients 
(51). Therefore, our results suggest an important role 
for the ALK-1-Smad-JAG1 pathway in the pathogenesis of the 
vascular lesions of HHT. They also suggest that this 
pathway will be a novel therapeutic target for treatment of 
HHT. 